Truncated Exponential Distribution (truncexpon)#

The truncated exponential distribution is what you get when you take an exponential waiting time and then condition it to lie in a finite interval.

It is a natural model for positive quantities that are approximately exponential, but where a hard maximum exists (timeouts, finite observation windows, physical limits). SciPy exposes this distribution as scipy.stats.truncexpon.

What you’ll learn#

  • How truncexpon relates to the exponential distribution (conditioning/truncation)

  • PDF, CDF, inverse CDF, and how to sample it with NumPy only

  • Closed-form mean/variance and how to compute higher moments stably

  • How SciPy parameterizes truncexpon, including fitting

  • Practical modeling and inference patterns (likelihood, testing, Bayesian grids)

Notebook roadmap#

  1. Title & classification

  2. Intuition & motivation

  3. Formal definition (PDF/CDF)

  4. Moments & properties

  5. Parameter interpretation

  6. Derivations (expectation, variance, likelihood)

  7. Sampling & simulation (NumPy-only)

  8. Visualization (PDF, CDF, Monte Carlo)

  9. SciPy integration (scipy.stats.truncexpon)

  10. Statistical use cases

  11. Pitfalls

  12. Summary

import math

import numpy as np
import scipy
from scipy import stats
from scipy.optimize import minimize_scalar

import plotly.express as px
import plotly.graph_objects as go
import os
import plotly.io as pio

pio.templates.default = "plotly_white"
pio.renderers.default = os.environ.get("PLOTLY_RENDERER", "notebook")

SEED = 7
rng = np.random.default_rng(SEED)

np.set_printoptions(precision=6, suppress=True)

print("numpy ", np.__version__)
print("scipy ", scipy.__version__)
numpy  1.26.2
scipy  1.15.0

Prerequisites & notation#

Prerequisites

  • comfort with basic calculus (integration by parts)

  • basic probability (PDF/CDF, expectation)

  • familiarity with numerical stability (expm1, log1p) is helpful

Two parameterizations

  1. Rate/truncation form (common in probability texts)

  • Rate: (\lambda > 0)

  • Upper truncation: (B > 0)

  • Support: (x \in [0, B])

  2. SciPy form (scipy.stats.truncexpon)

  • Shape: (b > 0) (dimensionless)

  • Location: (\mathrm{loc} \in \mathbb{R})

  • Scale: (\mathrm{scale} > 0)

  • Support: (x \in [\mathrm{loc},; \mathrm{loc} + b,\mathrm{scale}])

The mapping between the two is:

  • (\lambda = 1/\mathrm{scale})

  • (B = b,\mathrm{scale})

  • shift by loc if the lower endpoint is not zero

So you can interpret (b = \lambda B) as “how many exponential scales fit inside the truncation window”.
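As a quick sanity check, the mapping can be verified numerically (the values of λ and B below are arbitrary illustrations):

```python
import numpy as np
from scipy import stats

lam, B = 1.4, 2.5  # arbitrary illustrative rate and truncation point

# Map (lam, B) to SciPy's (b, loc, scale)
b = lam * B        # dimensionless: exponential scales per window
scale = 1.0 / lam
loc = 0.0

dist = stats.truncexpon(b, loc=loc, scale=scale)

# Canonical PDF: lam * exp(-lam x) / (1 - exp(-lam B)) on [0, B]
x = np.linspace(0.01, B - 0.01, 5)
pdf_canonical = lam * np.exp(-lam * x) / -np.expm1(-lam * B)

print(np.allclose(dist.pdf(x), pdf_canonical))  # True
print(dist.support())  # (loc, loc + b*scale) == (0, B)
```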

1) Title & classification#

  • Name: truncexpon (truncated exponential)

  • Type: continuous

  • Support:

    • canonical: (x \in [0, B])

    • SciPy: (x \in [\mathrm{loc},; \mathrm{loc} + b,\mathrm{scale}])

  • Parameter space:

    • canonical: (\lambda \in (0,\infty)), (B \in (0,\infty))

    • SciPy: (b \in (0,\infty)), loc (\in \mathbb{R}), scale (\in (0,\infty))

2) Intuition & motivation#

What it models#

Start with an exponential waiting time (Y \sim \mathrm{Exp}(\lambda)) (memoryless waiting time). Now impose a hard maximum (B) and look at the conditional distribution:

[ X ;=; Y \mid (Y \le B). ]

That is exactly the truncated exponential.
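This conditioning can be checked by simulation: rejection-sample an exponential against the bound and compare with truncexpon (a sketch with arbitrary λ and B):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
lam, B = 1.4, 2.5  # illustrative values

# Conditioning by rejection: draw Exp(lam), keep only draws <= B
y = rng.exponential(scale=1 / lam, size=200_000)
kept = y[y <= B]

# truncexpon with the matching SciPy parameters
dist = stats.truncexpon(lam * B, scale=1 / lam)

print(kept.mean(), dist.mean())             # should agree closely
print(stats.kstest(kept, dist.cdf).pvalue)  # consistency check (p-value should not be tiny)
```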

Typical real-world use cases#

  • Timeout-limited durations: network requests or jobs that are aborted at a timeout

  • Finite observation windows: inter-arrival times observed only up to the end of a study period

  • Physical constraints: decay/lifetime-like mechanisms with an imposed maximum by design

  • Simulation of bounded positive features: bounded right-skewed variables

Relations to other distributions#

  • Exponential: as (B \to \infty) (or (b \to \infty)), truncexpon approaches the exponential distribution.

  • Uniform: as (B \to 0) (or (b \to 0)), the density becomes nearly constant on ([0,B]), approaching a uniform distribution.

  • Truncation vs censoring: truncated means values above (B) are not present (conditioned away). Censoring means you know an observation exceeded (B) but not its exact value.
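The distinction matters for the likelihood. A minimal sketch, assuming a known rate λ and bound B (all values illustrative):

```python
import numpy as np

def loglik_truncated(x: np.ndarray, lam: float, B: float) -> float:
    """Truncated data: every observation is <= B; density is f(x) / P(Y <= B)."""
    logZ = np.log(-np.expm1(-lam * B))        # log P(Y <= B)
    return float(np.sum(np.log(lam) - lam * x) - x.size * logZ)

def loglik_censored(x: np.ndarray, n_cens: int, lam: float, B: float) -> float:
    """Censored data: exact values x <= B plus n_cens observations known only to exceed B."""
    exact = np.sum(np.log(lam) - lam * x)     # density terms for observed values
    tail = n_cens * (-lam * B)                # each censored point contributes log P(Y > B)
    return float(exact + tail)

rng = np.random.default_rng(1)
lam, B = 1.4, 2.5
y = rng.exponential(1 / lam, size=1_000)
x_obs, n_cens = y[y <= B], int((y > B).sum())

print(loglik_truncated(x_obs, lam, B))
print(loglik_censored(x_obs, n_cens, lam, B))
```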

3) Formal definition#

PDF (rate/truncation form)#

Let (X) be an exponential random variable with rate (\lambda), conditioned to lie in ([0,B]). Define the normalizing constant:

[ Z(\lambda, B) = \mathbb{P}(Y\le B) = 1 - e^{-\lambda B}. ]

Then the PDF is

[ f(x\mid \lambda,B) = \begin{cases} \dfrac{\lambda e^{-\lambda x}}{1 - e^{-\lambda B}}, & 0\le x\le B \ 0, & \text{otherwise.} \end{cases} ]

CDF#

[ F(x\mid \lambda,B) = \begin{cases} 0, & x < 0 \ \dfrac{1 - e^{-\lambda x}}{1 - e^{-\lambda B}}, & 0\le x\le B \ 1, & x > B. \end{cases} ]

Inverse CDF (for sampling)#

For (U\sim\mathrm{Uniform}(0,1)), solve (U = F(X)):

[ U = \frac{1 - e^{-\lambda X}}{1 - e^{-\lambda B}} ;\Rightarrow; X = -\frac{1}{\lambda}\ln\Big(1 - U,(1 - e^{-\lambda B})\Big). ]

SciPy parameterization#

SciPy’s base distribution is the (\lambda=1) version truncated at (b): support ([0,b]). With loc and scale, SciPy uses

  • (x = \mathrm{loc} + \mathrm{scale},z)

  • (z \in [0,b])

So the support is ([\mathrm{loc},; \mathrm{loc} + b,\mathrm{scale}]).

def _one_minus_exp_neg(x: np.ndarray | float) -> np.ndarray | float:
    """Compute 1 - exp(-x) stably for x>=0."""
    return -np.expm1(-x)


def truncexpon_pdf(x: np.ndarray | float, b: float, loc: float = 0.0, scale: float = 1.0) -> np.ndarray:
    """PDF for SciPy-style truncexpon(b, loc, scale). NumPy-only, vectorized."""
    if not (b > 0):
        raise ValueError("b must be > 0")
    if not (scale > 0):
        raise ValueError("scale must be > 0")

    x = np.asarray(x, dtype=float)
    z = (x - loc) / scale
    in_support = (z >= 0.0) & (z <= b)

    Z = _one_minus_exp_neg(b)  # 1 - exp(-b)
    out = np.zeros_like(z)
    out[in_support] = np.exp(-z[in_support]) / (scale * Z)
    return out


def truncexpon_cdf(x: np.ndarray | float, b: float, loc: float = 0.0, scale: float = 1.0) -> np.ndarray:
    """CDF for SciPy-style truncexpon(b, loc, scale). NumPy-only, vectorized."""
    if not (b > 0):
        raise ValueError("b must be > 0")
    if not (scale > 0):
        raise ValueError("scale must be > 0")

    x = np.asarray(x, dtype=float)
    z = (x - loc) / scale

    Z = _one_minus_exp_neg(b)
    out = np.zeros_like(z)

    out = np.where(z < 0.0, 0.0, out)
    out = np.where(z >= b, 1.0, out)

    mid = (z >= 0.0) & (z < b)
    out[mid] = _one_minus_exp_neg(z[mid]) / Z
    return out


def truncexpon_ppf(u: np.ndarray | float, b: float, loc: float = 0.0, scale: float = 1.0) -> np.ndarray:
    """Inverse CDF (PPF) for SciPy-style truncexpon(b, loc, scale)."""
    if not (b > 0):
        raise ValueError("b must be > 0")
    if not (scale > 0):
        raise ValueError("scale must be > 0")

    u = np.asarray(u, dtype=float)
    if np.any((u < 0) | (u > 1)):
        raise ValueError("u must be in [0, 1]")

    Z = _one_minus_exp_neg(b)
    z = -np.log1p(-u * Z)  # stable: -log(1 - u*(1-exp(-b)))
    return loc + scale * z


def truncexpon_rvs_np(
    n: int,
    b: float,
    loc: float = 0.0,
    scale: float = 1.0,
    rng: np.random.Generator | None = None,
) -> np.ndarray:
    """NumPy-only sampling via inverse transform."""
    if rng is None:
        rng = np.random.default_rng()
    u = rng.random(n)
    return truncexpon_ppf(u, b=b, loc=loc, scale=scale)


def truncexpon_raw_moments_std(b: float, max_k: int = 4) -> list[float]:
    """Raw moments E[Z^k] for Z ~ truncexpon(b, loc=0, scale=1).

    Uses integer-k closed forms:
      ∫_0^b x^k e^{-x} dx = k! - e^{-b} * P_k(b)
    where P_k is a degree-k polynomial with factorial-like coefficients.
    """
    if not (b > 0):
        raise ValueError("b must be > 0")
    if max_k < 1:
        raise ValueError("max_k must be >= 1")

    q = math.exp(-b)
    Z = -math.expm1(-b)

    moments: list[float] = []

    if max_k >= 1:
        m1 = (1.0 - (b + 1.0) * q) / Z
        moments.append(m1)
    if max_k >= 2:
        m2 = (2.0 - (b * b + 2.0 * b + 2.0) * q) / Z
        moments.append(m2)
    if max_k >= 3:
        m3 = (6.0 - (b**3 + 3.0 * b * b + 6.0 * b + 6.0) * q) / Z
        moments.append(m3)
    if max_k >= 4:
        m4 = (24.0 - (b**4 + 4.0 * b**3 + 12.0 * b * b + 24.0 * b + 24.0) * q) / Z
        moments.append(m4)

    return moments


def truncexpon_mean_var_skew_kurt(b: float) -> tuple[float, float, float, float]:
    """Mean, variance, skewness, (excess) kurtosis for Z ~ truncexpon(b,0,1)."""
    m1, m2, m3, m4 = truncexpon_raw_moments_std(b, max_k=4)
    var = m2 - m1**2
    if var <= 0:
        raise FloatingPointError("non-positive variance")

    mu3 = m3 - 3 * m2 * m1 + 2 * m1**3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1**2 - 3 * m1**4

    skew = mu3 / (var ** 1.5)
    kurt = mu4 / (var**2)
    excess_kurt = kurt - 3.0
    return m1, var, skew, excess_kurt


def truncexpon_entropy_std(b: float) -> float:
    """Differential entropy for Z ~ truncexpon(b,0,1) in nats."""
    mean, _, _, _ = truncexpon_mean_var_skew_kurt(b)
    # h = E[X] + log(1 - exp(-b))
    return mean + math.log(-math.expm1(-b))


def truncexpon_mgf(t: np.ndarray | float, lam: float, B: float) -> np.ndarray:
    """MGF for X ~ Exp(lam) conditioned on X<=B (finite B)."""
    if not (lam > 0):
        raise ValueError("lam must be > 0")
    if not (B > 0):
        raise ValueError("B must be > 0")

    t = np.asarray(t, dtype=complex)
    Z = _one_minus_exp_neg(lam * B)

    denom = lam - t
    out = np.empty_like(t)

    close = np.isclose(denom, 0.0)
    out[close] = (lam * B) / Z

    not_close = ~close
    out[not_close] = (lam / denom[not_close]) * _one_minus_exp_neg(denom[not_close] * B) / Z
    return out

4) Moments & properties#

Let (X \sim \mathrm{TruncExp}(\lambda,B)) as defined above. Write (b = \lambda B) and note (b) is dimensionless.

Mean and variance#

A convenient closed form is:

[ \mathbb{E}[X] = \frac{1}{\lambda} - \frac{B}{e^{\lambda B} - 1}. ]

[ \mathrm{Var}(X) = \frac{1}{\lambda^2}\left(1 - \frac{b^2 e^{b}}{(e^{b}-1)^2}\right),\quad b=\lambda B. ]

As checks:

  • as (B\to\infty) ((b\to\infty)), the formulas approach the exponential mean (1/\lambda) and variance (1/\lambda^2)

  • as (B\to 0), the distribution approaches (\mathrm{Unif}(0,B))
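These limits are easy to verify numerically from the closed forms (λ and B chosen arbitrarily):

```python
import numpy as np

def trunc_mean_var(lam: float, B: float) -> tuple[float, float]:
    """Mean and variance of Exp(lam) truncated to [0, B] (closed forms above)."""
    b = lam * B
    mean = 1 / lam - B / np.expm1(b)                        # B / (e^b - 1)
    var = (1 - b**2 * np.exp(b) / np.expm1(b) ** 2) / lam**2
    return mean, var

lam = 1.4

# Large B: exponential limit, mean -> 1/lam, var -> 1/lam^2
print(trunc_mean_var(lam, B=100.0), (1 / lam, 1 / lam**2))

# Small B: uniform limit, mean -> B/2, var -> B^2/12
B = 1e-4
print(trunc_mean_var(lam, B), (B / 2, B**2 / 12))
```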

Higher moments, skewness, kurtosis#

For the standardized SciPy base (Z \sim \mathrm{truncexpon}(b,0,1)) (i.e. (\lambda=1), (B=b)), raw moments have closed forms:

[ \mathbb{E}[Z^k] = \frac{\int_0^b x^k e^{-x}dx}{1-e^{-b}} = \frac{\gamma(k+1,b)}{1-e^{-b}}, ]

and for integer (k) these simplify to polynomials times (e^{-b}). From raw moments you can compute skewness and kurtosis.

Scaling by scale and shifting by loc affect the mean and variance in the usual way; skewness and kurtosis are invariant under positive scaling and shifts.

MGF and characteristic function#

For (t \ne \lambda):

[ M_X(t) = \mathbb{E}[e^{tX}] = \frac{\lambda}{\lambda - t},\frac{1 - e^{-(\lambda-t)B}}{1 - e^{-\lambda B}}. ]

The characteristic function is (\varphi_X(\omega) = M_X(i\omega)).
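The closed form can be checked against direct numerical integration of the expectation (illustrative λ, B, and t):

```python
import numpy as np
from scipy import integrate

lam, B, t = 1.4, 2.5, 0.7  # illustrative values with t != lam

# Closed form: M(t) = lam/(lam - t) * (1 - e^{-(lam-t)B}) / (1 - e^{-lam B})
Z = -np.expm1(-lam * B)
mgf_closed = lam / (lam - t) * -np.expm1(-(lam - t) * B) / Z

# Direct numerical integration of E[e^{tX}] over the support [0, B]
mgf_quad, _ = integrate.quad(lambda x: np.exp(t * x) * lam * np.exp(-lam * x) / Z, 0.0, B)

print(mgf_closed, mgf_quad)  # the two agree to quadrature accuracy
```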

Entropy (differential, in nats)#

Using (\log f(x) = \log\lambda - \lambda x - \log(1-e^{-\lambda B})),

[ H(X) = -\mathbb{E}[\log f(X)] = -\log\lambda + \lambda,\mathbb{E}[X] + \log(1-e^{-\lambda B}). ]

Other properties#

  • Mode: at the left endpoint ((0) or loc)

  • Not memoryless: truncation breaks the strict memoryless property

  • Limits:

    • (b\to\infty): exponential

    • (b\to 0): approximately uniform

b_values = [0.3, 1.0, 3.0, 10.0]

rows = []
for b in b_values:
    mean, var, skew, exkurt = truncexpon_mean_var_skew_kurt(b)
    ent = truncexpon_entropy_std(b)

    dist = stats.truncexpon(b)
    sci_mean, sci_var, sci_skew, sci_exkurt = dist.stats(moments="mvsk")
    sci_ent = dist.entropy()

    rows.append(
        {
            "b": b,
            "mean (ours)": float(mean),
            "mean (scipy)": float(sci_mean),
            "var (ours)": float(var),
            "var (scipy)": float(sci_var),
            "skew (ours)": float(skew),
            "skew (scipy)": float(sci_skew),
            "exkurt (ours)": float(exkurt),
            "exkurt (scipy)": float(sci_exkurt),
            "entropy (ours)": float(ent),
            "entropy (scipy)": float(sci_ent),
        }
    )

table = go.Figure(
    data=[
        go.Table(
            header=dict(values=list(rows[0].keys())),
            cells=dict(values=[[row[k] for row in rows] for k in rows[0].keys()]),
        )
    ]
)
table.update_layout(title="Moments/entropy: NumPy formulas vs SciPy").show()

5) Parameter interpretation#

Shape b (SciPy)#

In the SciPy base distribution (loc=0, scale=1), the support is ([0,b]). So:

  • larger b means less truncation, more like a full exponential

  • smaller b means strong truncation, approaching a uniform distribution on ([0,b])

In the rate/truncation form, b corresponds to (b=\lambda B): truncation measured in units of the exponential scale.

scale#

scale stretches the distribution: if (Z) is the base distribution on ([0,b]), then (X = \mathrm{loc} + \mathrm{scale}\cdot Z).

  • Increasing scale increases all lengths (mean, standard deviation) proportionally.

  • Skewness and kurtosis do not change with scale.
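A quick check of these invariances with SciPy (the b, loc, and scale values below are arbitrary):

```python
from scipy import stats

b = 3.0  # illustrative shape
base = stats.truncexpon(b)                        # loc=0, scale=1
moved = stats.truncexpon(b, loc=2.0, scale=5.0)   # X = 2 + 5*Z

m0, v0, s0, k0 = (float(t) for t in base.stats(moments="mvsk"))
m1, v1, s1, k1 = (float(t) for t in moved.stats(moments="mvsk"))

print(m1, 2.0 + 5.0 * m0)   # mean: shifted and scaled
print(v1, 25.0 * v0)        # variance: scaled by scale**2
print(s1, s0)               # skewness: unchanged
print(k1, k0)               # excess kurtosis: unchanged
```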

loc#

loc shifts the distribution and the support by a constant.

x = np.linspace(0, 8, 600)

b_list = [0.5, 1.5, 4.0, 10.0]
fig = go.Figure()
for b in b_list:
    fig.add_trace(
        go.Scatter(
            x=x,
            y=truncexpon_pdf(x, b=b, loc=0.0, scale=1.0),
            mode="lines",
            name=f"b={b:g}",
        )
    )
    fig.add_vline(x=b, line_dash="dot", opacity=0.25)

fig.update_layout(
    title="Truncated exponential PDF (loc=0, scale=1) for different b",
    xaxis_title="x",
    yaxis_title="f(x)",
)
fig.show()

6) Derivations#

We sketch the key derivations for the rate/truncation form.

Expectation#

[ \mathbb{E}[X] = \int_0^B x,\frac{\lambda e^{-\lambda x}}{1-e^{-\lambda B}},dx. ]

Compute the numerator via integration by parts. Let (u=x), (dv=\lambda e^{-\lambda x},dx), so (du=dx) and (v=-e^{-\lambda x}):

[ \int_0^B x,\lambda e^{-\lambda x},dx = \left[-x e^{-\lambda x}\right]_0^B + \int_0^B e^{-\lambda x},dx = -B e^{-\lambda B} + \frac{1-e^{-\lambda B}}{\lambda}. ]

Divide by (1-e^{-\lambda B}):

[ \mathbb{E}[X] = \frac{1}{\lambda} - \frac{B e^{-\lambda B}}{1-e^{-\lambda B}} = \frac{1}{\lambda} - \frac{B}{e^{\lambda B}-1}. ]
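The integration-by-parts step can be confirmed numerically (with arbitrary λ and B):

```python
import numpy as np
from scipy import integrate

lam, B = 1.4, 2.5  # illustrative values

# Left side: integral of x * lam * e^{-lam x} over [0, B], computed numerically
lhs, _ = integrate.quad(lambda x: x * lam * np.exp(-lam * x), 0.0, B)

# Right side: -B e^{-lam B} + (1 - e^{-lam B}) / lam, from integration by parts
rhs = -B * np.exp(-lam * B) + -np.expm1(-lam * B) / lam

print(lhs, rhs)  # the two agree
```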

Second moment and variance#

Similarly,

[ \mathbb{E}[X^2] = \int_0^B x^2,\frac{\lambda e^{-\lambda x}}{1-e^{-\lambda B}},dx = \frac{2}{\lambda^2} - \frac{B^2 + 2B/\lambda}{e^{\lambda B}-1}. ]

Then (\mathrm{Var}(X) = \mathbb{E}[X^2] - (\mathbb{E}[X])^2), which simplifies to

[ \mathrm{Var}(X) = \frac{1}{\lambda^2}\left(1 - \frac{(\lambda B)^2 e^{\lambda B}}{(e^{\lambda B}-1)^2}\right). ]

Likelihood#

Given i.i.d. observations (x_1,\dots,x_n\in[0,B]), the log-likelihood for (\lambda) (with (B) fixed) is:

[ \ell(\lambda) = \sum_{i=1}^n \log f(x_i\mid\lambda,B) = n\log\lambda - \lambda\sum_i x_i - n\log(1-e^{-\lambda B}). ]

Setting the derivative to zero yields an equation for the MLE (\hat\lambda):

[ 0 = \frac{n}{\lambda} - \sum_i x_i - n,\frac{B e^{-\lambda B}}{1-e^{-\lambda B}}. ]

There is generally no closed form, but the objective is smooth and easy to optimize numerically.

def truncexp_loglik_lambda(x: np.ndarray, lam: float, B: float) -> float:
    x = np.asarray(x, dtype=float)
    if not (lam > 0):
        return -np.inf
    if not (B > 0):
        return -np.inf
    if np.any((x < 0) | (x > B)):
        return -np.inf

    # log f = log lam - lam x - log(1-exp(-lam B))
    # use log1p/expm1 for stability
    logZ = np.log(_one_minus_exp_neg(lam * B))
    return float(x.size * np.log(lam) - lam * x.sum() - x.size * logZ)


# Demonstration: simulate data with known (lam,B) and estimate lam by MLE
lam_true = 1.4
B = 2.5

# In SciPy form: scale=1/lam, b=lam*B
b = lam_true * B
x = truncexpon_rvs_np(n=800, b=b, loc=0.0, scale=1 / lam_true, rng=rng)

def neg_loglik(lam: float) -> float:
    return -truncexp_loglik_lambda(x, lam=lam, B=B)

res = minimize_scalar(neg_loglik, bounds=(1e-6, 20.0), method="bounded")
lam_hat = float(res.x)

lam_true, lam_hat, res.success
(1.4, 1.4472584711835175, True)

7) Sampling & simulation (NumPy-only)#

Inverse transform sampling uses the inverse CDF derived earlier.

For SciPy parameters (b, loc, scale):

[ X = \mathrm{loc} + \mathrm{scale}\cdot \Big[-\ln(1 - U(1-e^{-b}))\Big],\quad U\sim\mathrm{Unif}(0,1). ]

Numerical note: compute (1-e^{-b}) with -expm1(-b) and use log1p for the log.
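To see why the stable forms matter, compare them with the naive expressions for a very small b (the specific b and u are illustrative):

```python
import numpy as np

b = 1e-12  # strongly truncated: 1 - exp(-b) is tiny
u = 0.5

# Naive: 1 - exp(-b) cancels catastrophically near 1
Z_naive = 1.0 - np.exp(-b)
Z_stable = -np.expm1(-b)  # accurate to full precision

x_naive = -np.log(1.0 - u * Z_naive)
x_stable = -np.log1p(-u * Z_stable)

# The exact quantile is u*b to first order (~5e-13 here)
print(Z_naive, Z_stable)
print(x_naive, x_stable)
```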

b = 3.0
u_grid = np.linspace(0.001, 0.999, 1000)

ppf_ours = truncexpon_ppf(u_grid, b=b, loc=0.0, scale=1.0)
ppf_scipy = stats.truncexpon(b).ppf(u_grid)

np.max(np.abs(ppf_ours - ppf_scipy))
4.440892098500626e-16

8) Visualization (PDF, CDF, Monte Carlo)#

We’ll visualize:

  • the PDF and CDF for multiple b

  • Monte Carlo samples vs the theoretical density

b_list = [0.5, 1.0, 3.0, 8.0]
x = np.linspace(0, 8, 800)

fig_pdf = go.Figure()
fig_cdf = go.Figure()

for b in b_list:
    fig_pdf.add_trace(
        go.Scatter(x=x, y=truncexpon_pdf(x, b=b), mode="lines", name=f"b={b:g}")
    )
    fig_cdf.add_trace(
        go.Scatter(x=x, y=truncexpon_cdf(x, b=b), mode="lines", name=f"b={b:g}")
    )

fig_pdf.update_layout(title="PDF for different b (loc=0, scale=1)", xaxis_title="x", yaxis_title="f(x)")
fig_cdf.update_layout(title="CDF for different b (loc=0, scale=1)", xaxis_title="x", yaxis_title="F(x)")

fig_pdf.show()
fig_cdf.show()
# Monte Carlo vs theoretical
b = 3.0
n = 60_000
x_samp = truncexpon_rvs_np(n=n, b=b, rng=rng)

grid = np.linspace(0, b, 600)
pdf_grid = truncexpon_pdf(grid, b=b)

hist_y, hist_edges = np.histogram(x_samp, bins=70, range=(0, b), density=True)
hist_x = 0.5 * (hist_edges[:-1] + hist_edges[1:])

fig = go.Figure()
fig.add_trace(go.Bar(x=hist_x, y=hist_y, name="Monte Carlo (density)", opacity=0.55))
fig.add_trace(go.Scatter(x=grid, y=pdf_grid, mode="lines", name="Theoretical PDF"))

fig.update_layout(
    title=f"Monte Carlo vs PDF (b={b:g}, n={n:,})",
    xaxis_title="x",
    yaxis_title="density",
)
fig.show()

# Moment check
mean_theory, var_theory, *_ = truncexpon_mean_var_skew_kurt(b)
mean_mc = float(x_samp.mean())
var_mc = float(x_samp.var())

mean_theory, mean_mc, var_theory, var_mc
(0.8428129105262321,
 0.8452472222924361,
 0.5037309504814621,
 0.5059742689284107)

9) SciPy integration (scipy.stats.truncexpon)#

SciPy exposes this distribution as scipy.stats.truncexpon.

Key points:

  • Shape parameter: b

  • Location/scale: loc, scale

  • Support: ([\mathrm{loc},; \mathrm{loc}+b,\mathrm{scale}])

Useful methods:

  • pdf, logpdf

  • cdf, ppf

  • rvs

  • stats(moments="mvsk")

  • fit (MLE)

b_true, loc_true, scale_true = 4.0, 1.5, 2.0
dist = stats.truncexpon(b_true, loc=loc_true, scale=scale_true)

x0 = np.array([loc_true, loc_true + 0.5 * scale_true, loc_true + b_true * scale_true])

pd = dist.pdf(x0)
cd = dist.cdf(x0)
rv = dist.rvs(size=5, random_state=rng)

pd, cd, rv
(array([0.509329, 0.308923, 0.009329]),
 array([0.     , 0.40081, 1.     ]),
 array([3.257258, 1.704176, 4.265893, 4.66483 , 2.184419]))
# Parameter fitting example
n = 2_000
x = stats.truncexpon(b_true, loc=loc_true, scale=scale_true).rvs(size=n, random_state=rng)

# Fitting all params can be hard; fixing loc when known often helps.
b_hat1, loc_hat1, scale_hat1 = stats.truncexpon.fit(x)
b_hat2, loc_hat2, scale_hat2 = stats.truncexpon.fit(x, floc=loc_true)

(
    (b_true, loc_true, scale_true),
    (b_hat1, loc_hat1, scale_hat1),
    (b_hat2, loc_hat2, scale_hat2),
)
((4.0, 1.5, 2.0),
 (1.1847469403034272, 0.8491731695113298, 7.168808471117396),
 (1.3286667936902536, 1.5, 5.902455836679641))

10) Statistical use cases#

A) Hypothesis testing#

If a truncation point (B) is known (e.g., a fixed timeout), you can test hypotheses about the rate (\lambda) using a likelihood ratio test:

  • (H_0: \lambda=\lambda_0)

  • (H_1: \lambda) free

The test statistic is (2(\ell(\hat\lambda)-\ell(\lambda_0))), which is asymptotically (\chi^2_1) under regularity conditions.

B) Bayesian modeling#

Truncated exponentials show up naturally when a prior or likelihood is exponential-like but constrained:

  • bounded waiting times (timeouts)

  • bounded survival times due to study design

A Gamma prior for (\lambda) is no longer strictly conjugate because of the (\log(1-e^{-\lambda B})) term, but you can still compute a posterior numerically (grid, MCMC).

C) Generative modeling#

Use truncexpon to generate bounded, right-skewed features. It can also be a component of mixture models (e.g., two populations with different scales but the same timeout).

# A) Likelihood ratio test for λ with known truncation B
lam_true = 1.4
B = 2.5
n = 600

b = lam_true * B
x = truncexpon_rvs_np(n=n, b=b, loc=0.0, scale=1 / lam_true, rng=rng)

def mle_lambda_given_B(x: np.ndarray, B: float) -> float:
    def nll(lam: float) -> float:
        return -truncexp_loglik_lambda(x, lam=lam, B=B)

    res = minimize_scalar(nll, bounds=(1e-6, 50.0), method="bounded")
    return float(res.x)


lam_hat = mle_lambda_given_B(x, B=B)

lam0 = 1.0
ll_hat = truncexp_loglik_lambda(x, lam=lam_hat, B=B)
ll_0 = truncexp_loglik_lambda(x, lam=lam0, B=B)

lrt = 2 * (ll_hat - ll_0)
p_value = 1 - stats.chi2(df=1).cdf(lrt)

lam_true, lam_hat, lrt, p_value
(1.4, 1.4847492967374485, 45.83350937849366, 1.2874257215855778e-11)
# B) Bayesian grid posterior for λ (not conjugate due to truncation term)
# x is reused from the previous cell

# Prior: λ ~ Gamma(alpha0, beta0) with rate beta0
alpha0 = 2.0
beta0 = 1.0

lam_grid = np.linspace(0.05, 6.0, 600)

log_prior = stats.gamma(a=alpha0, scale=1 / beta0).logpdf(lam_grid)
log_like = np.array([truncexp_loglik_lambda(x, lam=float(l), B=B) for l in lam_grid])

log_post_unnorm = log_prior + log_like
log_post_unnorm -= log_post_unnorm.max()

post_unnorm = np.exp(log_post_unnorm)
post = post_unnorm / np.trapz(post_unnorm, lam_grid)

fig = px.line(x=lam_grid, y=post, title="Posterior over λ (grid approximation)")
fig.update_layout(xaxis_title="λ", yaxis_title="posterior density")
fig.add_vline(x=lam_true, line_dash="dot", opacity=0.35)
fig.add_vline(x=lam_hat, line_dash="dash", opacity=0.35)
fig.show()

lam_post_mean = float(np.trapz(lam_grid * post, lam_grid))
lam_post_mean
1.484737376630421
# C) Simple mixture generation example
n = 20_000
B = 2.0

lam1, lam2 = 0.8, 2.0
w = 0.6

u = rng.random(n)
comp = (u < w)

x1 = truncexpon_rvs_np(n=comp.sum(), b=lam1 * B, scale=1 / lam1, rng=rng)
x2 = truncexpon_rvs_np(n=(~comp).sum(), b=lam2 * B, scale=1 / lam2, rng=rng)
x_mix = np.empty(n)
x_mix[comp] = x1
x_mix[~comp] = x2

fig = px.histogram(x_mix, nbins=80, histnorm="probability density", title="Mixture of two truncated exponentials")
fig.update_layout(xaxis_title="x", yaxis_title="density")
fig.show()

11) Pitfalls#

  • Invalid parameters: b <= 0 or scale <= 0 is invalid. Always validate.

  • Truncation vs censoring:

    • truncated: you only see samples (\le B) because the model is conditional on that event

    • censored: you may observe “(>B)” without the exact value; the likelihood is different

  • Numerical stability:

    • use -expm1(-b) for (1-e^{-b}) when b is small

    • use log1p for (\log(1-u(1-e^{-b})))

    • prefer logpdf when densities are extremely small

  • Fitting:

    • estimating loc, scale, and b jointly can be unstable

    • when the truncation endpoint is known (timeouts), fix it via floc/fscale or by reparameterizing

  • Goodness-of-fit with fitted parameters: classical KS p-values are not exact after parameter fitting; consider bootstrap if you need calibrated p-values.
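A parametric-bootstrap version of the KS test can be sketched as follows. For simplicity we fix loc=0 and scale=1 so that only b is fitted; with those fixed, the MLE of b is the sample maximum, because the likelihood grows as the truncation point shrinks toward the largest observation (all values below are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Simulated sample playing the role of observed data
b_true = 3.0
x = stats.truncexpon(b_true).rvs(size=300, random_state=rng)

def fit_and_ks(data: np.ndarray) -> tuple[float, float]:
    # With loc=0, scale=1 fixed, the MLE of b is the sample maximum
    b_hat = float(data.max())
    d = stats.kstest(data, stats.truncexpon(b_hat).cdf).statistic
    return b_hat, d

b_hat, d_obs = fit_and_ks(x)

# Parametric bootstrap: simulate from the fitted model, refit, recompute KS
n_boot = 500
d_boot = np.empty(n_boot)
for i in range(n_boot):
    xb = stats.truncexpon(b_hat).rvs(size=x.size, random_state=rng)
    d_boot[i] = fit_and_ks(xb)[1]

p_boot = float((d_boot >= d_obs).mean())
print(b_hat, d_obs, p_boot)  # bootstrap p-value calibrated for the fitting step
```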

12) Summary#

  • truncexpon is a continuous distribution: an exponential waiting time conditioned to lie in a finite interval.

  • Canonical PDF/CDF on ([0,B]): (f(x)=\lambda e^{-\lambda x}/(1-e^{-\lambda B})), (F(x)=(1-e^{-\lambda x})/(1-e^{-\lambda B})).

  • Mean/variance have closed forms; skewness/kurtosis depend only on the dimensionless truncation (b=\lambda B).

  • Sampling is easy with inverse CDF and can be implemented with NumPy only.

  • In SciPy: stats.truncexpon(b, loc=..., scale=...) with support ([\mathrm{loc}, \mathrm{loc}+b,\mathrm{scale}]).

  • Inference often requires numerical optimization (MLE) or numerical Bayes (grids/MCMC) because truncation breaks some convenient conjugacies.